15 research outputs found

    Leveraging phone-level linguistic-acoustic similarity for utterance-level pronunciation scoring

    Recent studies on pronunciation scoring have explored the effect of introducing phone embeddings as reference pronunciation, but mostly in an implicit manner, i.e., by adding or concatenating the reference phone embedding and the actual pronunciation of the target phone to form the phone-level pronunciation quality representation. In this paper, we propose to use linguistic-acoustic similarity to explicitly measure the deviation of non-native production from its native reference for pronunciation assessment. Specifically, the deviation is first estimated by the cosine similarity between the reference phone embedding and the corresponding acoustic embedding. Next, a phone-level goodness of pronunciation (GOP) pre-training stage is introduced to guide this similarity-based learning and to better initialize the two embeddings. Finally, a transformer-based hierarchical pronunciation scorer maps the sequences of phone embeddings and acoustic embeddings, along with their similarity measures, to the final utterance-level score. Experimental results on non-native databases suggest that the proposed system significantly outperforms the baselines, in which the acoustic and phone embeddings are simply added or concatenated. A further examination shows that the phone embeddings learned in the proposed approach capture linguistic-acoustic attributes of native pronunciation as reference. Comment: Accepted by ICASSP 202
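
    The similarity step described in this abstract can be illustrated with a minimal sketch. It assumes each phone has one reference phone embedding and one acoustic embedding of the same dimension; the function names and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def phone_similarities(phone_embeddings, acoustic_embeddings):
    """One similarity value per phone in the utterance (hypothetical helper)."""
    return [
        cosine_similarity(p, a)
        for p, a in zip(phone_embeddings, acoustic_embeddings)
    ]


# Toy data (assumption): two phones with 2-dimensional embeddings.
reference = [[1.0, 0.0], [0.0, 2.0]]
produced = [[2.0, 0.0], [3.0, 0.0]]
similarities = phone_similarities(reference, produced)
```

    In this toy case the first phone is perfectly aligned with its reference and the second is orthogonal to it, so the two similarity values sit at opposite ends of the non-negative range.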

    An ASR-free Fluency Scoring Approach with Self-Supervised Learning

    A typical fluency scoring system relies on an automatic speech recognition (ASR) system to obtain time stamps in the input speech, either for the subsequent calculation of fluency-related features or for directly modeling speech fluency with an end-to-end approach. This paper describes a novel ASR-free approach to automatic fluency assessment using self-supervised learning (SSL). Specifically, wav2vec2.0 is used to extract frame-level speech features, followed by K-means clustering to assign a pseudo label (cluster index) to each frame. A BLSTM-based model is trained to predict an utterance-level fluency score from the frame-level SSL features and the corresponding cluster indexes. Neither speech transcription nor time stamp information is required in the proposed system. It is ASR-free and can potentially avoid the effect of ASR errors in practice. Experimental results on non-native English databases show that the proposed approach significantly improves performance in the "open response" scenario compared to previous methods and matches the recently reported performance in the "read aloud" scenario. Comment: Accepted by ICASSP 202
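
    The pseudo-labeling step mentioned in this abstract (K-means over frame-level features, cluster index as label) can be sketched as follows. This is a plain Lloyd's-algorithm illustration under stated assumptions: the frames are toy 2-dimensional vectors rather than wav2vec2.0 features, the centroids are initialized from the first k frames for determinism, and `kmeans_labels` is a hypothetical name.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans_labels(frames, k, iterations=20):
    """Assign a cluster index (pseudo label) to every frame."""
    if not 1 <= k <= len(frames):
        raise ValueError("k must be between 1 and the number of frames")
    # Assumption: deterministic initialization from the first k frames.
    centroids = [list(f) for f in frames[:k]]
    labels = [0] * len(frames)
    for _ in range(iterations):
        # Assignment step: nearest centroid for each frame.
        labels = [
            min(range(k), key=lambda c: squared_distance(f, centroids[c]))
            for f in frames
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [f for f, lab in zip(frames, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy frames (assumption): two well-separated groups of 2-D features.
frames = [[0.0, 0.1], [5.0, 5.1], [0.2, 0.0], [5.2, 4.9], [0.1, 0.2]]
pseudo_labels = kmeans_labels(frames, k=2)
```

    The resulting label sequence has one entry per frame and could be fed to a sequence model alongside the features; that downstream model is outside the scope of this sketch.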

    Throughput Maximization for the Full-Duplex Two-Way Relay System with Energy Harvesting

    The full-duplex technique can improve the transmission capacity of communication systems, and energy harvesting (EH) is a promising way to prolong the lifespan of a wireless node by utilizing radio-frequency signals. In this paper, the throughput performance of a full-duplex two-way EH-capable relay system is investigated. In particular, a practical EH protocol, the time-switching-based relaying (TSR) protocol, is used for EH, and the decode-and-forward (DF) policy is used for information transmission. The outage probability is obtained, and the corresponding system throughput for the TSR protocol is derived from it. The derived throughput is a function of several system parameters, including the time-switching (TS) ratio, the power allocation ratio, and the length of the communication time slot. The throughput is then used to characterize a joint time and power allocation scheme for the system, and we aim to find the time and power allocation that achieves the optimal throughput. Because the throughput expression involves three variables and has an integral form, optimizing it is difficult. However, a modified simulated annealing-based search (SABS) algorithm can be used to optimize the throughput. The modified SABS algorithm overcomes the strong dependence on the initial point and finds the optimal solution quickly. Simulation results show that the analytical throughput depends on the TS ratio, the power allocation ratio, and the length of the communication time slot. The analytical throughput curve matches the simulated one well, which shows that the analytical system throughput obtained for the TSR protocol is valid. Meanwhile, the proposed modified SABS algorithm can be used to derive accurate throughput when the SNR is higher than 10 dB.
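
    To make the search idea in this abstract concrete, here is a minimal sketch of plain simulated annealing over three bounded variables. It is not the paper's modified SABS algorithm, and `toy_throughput` is a made-up placeholder objective, not the paper's throughput expression; the bounds, cooling schedule, and step size are likewise assumptions.

```python
import math
import random


def toy_throughput(x):
    """Placeholder objective (assumption), maximized at (0.3, 0.6, 1.5)."""
    alpha, beta, slot = x
    return -((alpha - 0.3) ** 2 + (beta - 0.6) ** 2 + (slot - 1.5) ** 2)


def simulated_annealing(objective, bounds, steps=5000,
                        t_start=1.0, t_end=1e-4, seed=0):
    """Maximize `objective` inside box `bounds` with geometric cooling."""
    rng = random.Random(seed)
    current = [rng.uniform(lo, hi) for lo, hi in bounds]
    current_val = objective(current)
    best, best_val = list(current), current_val
    for k in range(steps):
        temp = t_start * (t_end / t_start) ** (k / (steps - 1))
        # Propose a neighbour by Gaussian perturbation, clipped to the box.
        candidate = [
            min(hi, max(lo, c + rng.gauss(0.0, 0.1 * (hi - lo))))
            for c, (lo, hi) in zip(current, bounds)
        ]
        cand_val = objective(candidate)
        delta = cand_val - current_val
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_val = candidate, cand_val
            if current_val > best_val:
                best, best_val = list(current), current_val
    return best, best_val


# Hypothetical ranges for TS ratio, power allocation ratio, and slot length.
bounds = [(0.0, 1.0), (0.0, 1.0), (0.5, 2.5)]
best, best_val = simulated_annealing(toy_throughput, bounds)
```

    Accepting occasional worse moves at high temperature is what reduces the dependence on the starting point; as the temperature falls, the search becomes effectively greedy.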

    Adsorption and Photo-catalytic Properties of Congo Red by Cobalt Doped Porous ZnO Prepared Through Hydrothermal Method

    ZnO and cobalt-doped ZnO were prepared by a hydrothermal method with zinc acetate dihydrate, cobalt acetate tetrahydrate, and urea as raw materials and sodium citrate as the surface modifier. Congo red (CR) was used as the model pollutant for adsorption and photocatalytic experiments. Under the same conditions, the adsorption of CR by cobalt-doped ZnO prepared with different urea contents was investigated, and the optimum urea content was determined. Under the same conditions, the adsorption and photocatalytic properties of cobalt-doped ZnO prepared with the optimum urea content at different annealing temperatures were studied. According to the experimental data, Co-doped ZnO prepared with a urea content of 8 mmol and an annealing temperature of 300 °C shows the best adsorption, and its photocatalytic performance is also the best under the same conditions.

    A Novel RFID Authentication Protocol Based on Reconfigurable RRAM PUF

    Radio frequency identification (RFID) technology has empowered a wide variety of automation industries. To address the limited information protection methods of current lightweight RFID encryption schemes, this paper proposes a new high-efficiency reconfigurable strong physical unclonable function (PUF) circuit structure built from resistive random access memory (RRAM). Experimental results show that the proposed PUF achieves an inter-chip Hamming distance (HD) close to the ideal value of 50% (µ/σ = 0.5001/0.0340) among 1000 PUF keys, and the intra-chip HD results are very close to the ideal value (0). The bit error rate (BER) is as low as 3.8 × 10⁻⁶ across one million challenges. Based on the RRAM PUF, we propose and implement a lightweight RFID authentication protocol. By virtue of RRAM’s model ability, the protocol replaces the one-way hash function with a response chain mutual encryption algorithm. Test and analysis results show that the protocol can effectively resist multiple threats such as physical attacks, replay attacks, tracking attacks, and asynchronous attacks, and has good stability. At the same time, owing to RRAM’s unique resistance variability, the PUF also has the advantage of being reconfigurable, providing good security for RFID tags.
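
    The inter-chip and intra-chip Hamming distance figures quoted in this abstract are usually computed as the average fraction of differing bits over all response pairs. The sketch below shows that calculation on made-up 8-bit responses; the bit strings and function names are hypothetical and are not data from the paper.

```python
from itertools import combinations


def fractional_hd(a, b):
    """Fraction of bit positions at which two equal-length responses differ."""
    if len(a) != len(b):
        raise ValueError("responses must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_pairwise_hd(responses):
    """Average fractional Hamming distance over all unordered pairs."""
    pairs = list(combinations(responses, 2))
    if not pairs:
        raise ValueError("need at least two responses")
    return sum(fractional_hd(a, b) for a, b in pairs) / len(pairs)


# Inter-chip (assumption): different chips answering the same challenge.
chip_keys = ["10110010", "01100111", "11010001"]
inter_chip = mean_pairwise_hd(chip_keys)

# Intra-chip (assumption): one chip answering the same challenge repeatedly.
repeated = ["10110010", "10110010", "10110011"]
intra_chip = mean_pairwise_hd(repeated)
```

    An inter-chip value near 0.5 indicates that keys from different chips look uncorrelated, while an intra-chip value near 0 indicates that a single chip reproduces its own response reliably.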

    Coupling of kenaf Biochar and Magnetic BiFeO3 onto Cross-Linked Chitosan for Enhancing Separation Performance and Cr(VI) Ions Removal Efficiency

    Cr(VI) contamination poses a great threat to both the ecosystem and human health owing to its carcinogenic and mutagenic nature. A highly effective adsorbent for the removal of Cr(VI) was prepared, and its adsorption mechanism is thoroughly discussed in this study. In detail, magnetic BiFeO3 and kenaf biochar were loaded on cross-linked chitosan to obtain chitosan-kenaf biochar@BiFeO3 (CKB) with improved adsorption capacity towards Cr(VI). The adsorption of Cr(VI) onto CKB was evaluated as a function of pH, the presence of competing ions, the initial concentration of Cr(VI), and contact time. The results show that CKB exhibits the highest adsorption capacity at the optimal pH of 2.0. The presence of competing ions such as Ca²⁺, NO₃⁻, SO₄²⁻, and Cl⁻ decreases the adsorption capacity; among them, Ca²⁺ and NO₃⁻ show the greatest hindrance. By studying the effect of the initial Cr(VI) concentration on the adsorption capacity, it was found that the CKB in the solution was sufficient to remove Cr(VI) in all treatments (10–200 mg/L). The experimental adsorption data were fitted well by the pseudo-first-order model, suggesting that chemisorption is not the dominant rate-limiting step. The Freundlich isotherm model better explains the adsorption process, indicating non-ideal adsorption of Cr(VI) on a heterogeneous CKB surface. A 2⁵⁻¹ fractional factorial design (FFD) showed that pH and the initial concentration of Cr(VI) have a significant influence on Cr(VI) adsorption in our reaction system. In general, the excellent adsorption efficiency of CKB indicates that it may be a good candidate for the remediation of Cr(VI)-contaminated wastewater.
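
    For reference, the two models named in this abstract are commonly written in the textbook forms below. The symbols are the conventional ones and are assumptions here, since the abstract does not define them: q_t and q_e are the adsorbed amounts at time t and at equilibrium, k_1 is the pseudo-first-order rate constant, C_e is the equilibrium concentration, and K_F and n are the Freundlich constants.

```latex
% Pseudo-first-order kinetics (integrated and linearized forms)
q_t = q_e \left( 1 - e^{-k_1 t} \right),
\qquad
\ln\left( q_e - q_t \right) = \ln q_e - k_1 t

% Freundlich isotherm (power-law and linearized forms)
q_e = K_F \, C_e^{1/n},
\qquad
\ln q_e = \ln K_F + \frac{1}{n} \ln C_e
```

    Both linearized forms are straight lines in the transformed variables, which is why such models are often fitted by linear regression.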